
AUT University Certificate in Foundation Studies

Delivered by ACG Norton College
NAME:…………………………………. ID:………………………..
Due: Thursday 26th November, 2015
Mark Scheme
1. Assignment questions 75
R Test 15
1. All parts of your assignment MUST be word processed. Any part
written in ink or pencil will be ignored!
2. Label all graphs appropriately, and give each graph a suitable main title.
3. Show ALL workings and R output.
4. Round all calculations sensibly.
5. Assignments handed in late will receive 0%.
Section A. This section is to be completed using ONLY your calculator for the required
Question 1 [10]
(a) Ages from a group of athletes are approximately normal, X ~ N(33.1 yrs, 2.3 yrs). Apply the
68-95-99.7 Rule to determine the interval in which the middle 99.7% of all ages will fall.
Using Z tables
(b) Determine what percentage of athletes’ ages will be below 31 years. [2]

(c) If a sample of 1700 ages is taken from the athletes, determine how many athletes will have
an age above 34 years. [3]
(d) Find the upper quartile for the athletes’ ages. [3]
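
The calculations in Question 1 can be cross-checked in software. The following is a minimal sketch using Python's standard library, not the calculator and Z-table method the question asks for. It assumes the second parameter in N(33.1, 2.3) is the standard deviation rather than the variance; the variable names are illustrative only.

```python
from statistics import NormalDist

# Assumption: 2.3 is the standard deviation of the athletes' ages.
ages = NormalDist(mu=33.1, sigma=2.3)

# (a) 68-95-99.7 rule: the middle 99.7% lies within 3 standard deviations of the mean.
low = ages.mean - 3 * ages.stdev
high = ages.mean + 3 * ages.stdev

# (b) Proportion of ages below 31 years.
p_below_31 = ages.cdf(31)

# (c) Expected number of ages above 34 years in a sample of 1700.
expected_above_34 = 1700 * (1 - ages.cdf(34))

# (d) Upper quartile: the age that 75% of the distribution falls below.
upper_quartile = ages.inv_cdf(0.75)

print(f"(a) middle 99.7%: {low:.1f} to {high:.1f} years")
print(f"(b) below 31 years: {100 * p_below_31:.2f}%")
print(f"(c) above 34 years: about {expected_above_34:.0f} of 1700")
print(f"(d) upper quartile: {upper_quartile:.2f} years")
```

Results from Z tables may differ slightly in the last digit because table lookups round the z-score to two decimal places.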
Question 2 [25]
(a) Explain the terms non-response bias and response bias in sampling. Give an example of each, not
the same as those in your notes. [4]

(b) Name some factors that make a questionnaire successful. [2]
(c) Define the term “sampling frame”. [1]
(d) What is an undercount in a census? [2]
(e) 1. Describe how to use a calculator to randomly select numbers in a range from 01 to 60. [2]
2. The heights, in cm, of a community of 60 people are collected in the table below.

i) Use Table B of random digits to undertake a simple random sample to select 8
people's heights from the table above. Start at the beginning of row 105, reading the digits
continuously from left to right across the row. Place your results below. [2]
ii) Evaluate your sample mean height. [2]
3. In the population of 60 heights, 25 are from females. Fully describe how you would
complete a stratified random sample of size 20 with respect to gender. You do not
need to do the sample. [5]
4. A systematic sample of size 6 is to be undertaken from the population of heights. Explain
fully how a systematic sampling procedure is conducted if a random starting position at the
height numbered 16 is chosen. List the six heights selected. [3]
5. State one advantage and one disadvantage of a census of all 60 people in the community.
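
The sampling procedures in Question 2(e) can be sketched in code. This is a hedged illustration rather than the calculator or Table B method the question requires. The function names are hypothetical, the stratified sample is assumed to use proportional allocation, and the systematic sample is assumed to wrap around to the start of the list once it passes person 60. The heights table is not reproduced here, so only person numbers are generated.

```python
import random

POPULATION_SIZE = 60  # people numbered 01 to 60


def simple_random_sample(n, seed=None):
    """Draw n distinct person numbers from 1..POPULATION_SIZE."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, POPULATION_SIZE + 1), n))


def proportional_allocation(strata_sizes, sample_size):
    """Share sample_size across strata in proportion to stratum size.

    Rounding happens per stratum, so in general the parts may need a
    manual adjustment to add up to sample_size.
    """
    total = sum(strata_sizes.values())
    return {name: round(sample_size * size / total)
            for name, size in strata_sizes.items()}


def systematic_positions(start, sample_size):
    """Take every k-th position from start, wrapping past the end (assumed)."""
    k = POPULATION_SIZE // sample_size  # sampling interval
    return [(start - 1 + i * k) % POPULATION_SIZE + 1
            for i in range(sample_size)]


srs = simple_random_sample(8, seed=1)
allocation = proportional_allocation({"female": 25, "male": 35}, 20)
positions = systematic_positions(start=16, sample_size=6)

print("Simple random sample:", srs)
print("Stratified allocation:", allocation)
print("Systematic positions:", positions)
```

The 35 males follow from the 60 people less the 25 females. Within each stratum the people would then be chosen by simple random sampling.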
Question 3 [10]
The variable self.concept can be found in the data set EduData,
Blackboard- RData- EduData.txt
(a) Explain fully in what way the distribution for self.concept is non-Normal. You must use and
include a Normal Quantile plot. [4]
(b) What is granularity? Explain whether granularity is present in this data set. [2]

(c) Use R Commander to provide proof that the observations of self.concept were taken from a
non-Normal distribution.
Produce another graph as well as summary statistics to form part of the proof. The graph
must have suitable labels and titles. Write a small paragraph on your findings. [4]
Section B. This section is to be completed using ONLY R Commander for the required
Question 4 [30]
a) A random sample of 78 Year Five students at a large school was surveyed, and the researcher
recorded the values of several variables for each student. A linear relationship between the two
variables shown below was investigated:
Variable   Description
NSL        National Standard Literacy – a numeric academic measure.
IQ         Intelligence Quotient – a numeric intelligence measure.
Linear Regression output from R

Call:
lm(formula = NSL ~ IQ, data = NSL)

Residuals:
    Min      1Q  Median      3Q     Max
-6.3182 -0.5377  0.2178  1.0268  3.5785

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.55706    1.55176  -2.292   0.0247 *
IQ           0.10102    0.01414   7.142 4.74e-10 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.635 on 76 degrees of freedom
Multiple R-squared: 0.4016, Adjusted R-squared: 0.3937
F-statistic: 51.01 on 1 and 76 DF, p-value: 4.737e-10
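
Several quantities can be read straight off this output. The sketch below is an illustration in Python, not part of the R Commander workflow. It takes the slope, intercept and R-squared values from the output above, recovers the correlation coefficient as the square root of R-squared with the sign of the slope, and evaluates the fitted line at two IQ values. The helper name predict_nsl is hypothetical.

```python
import math

# Values copied from the regression output above.
intercept = -3.55706
slope = 0.10102
r_squared = 0.4016

# In simple linear regression, r is the square root of R-squared
# and takes the sign of the slope.
r = math.copysign(math.sqrt(r_squared), slope)


def predict_nsl(iq):
    """Fitted NSL level for a given IQ, using the least squares line."""
    return intercept + slope * iq


# The least squares line passes through (mean IQ, mean NSL), so
# evaluating it at the mean IQ of 108.9 gives the mean NSL.
nsl_at_115 = predict_nsl(115)
mean_nsl = predict_nsl(108.9)

print(f"r = {r:.4f}")
print(f"Predicted NSL at IQ 115: {nsl_at_115:.3f}")
print(f"Mean NSL: {mean_nsl:.3f}")
```

Using the coefficients rounded to 3 decimal places, as the question does, changes these results only in the third decimal place.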

i) Identify the response variable from the regression output. [1]
ii) The minimum residual value from the output is -6.3182. Indicate on the scatterplot which
data point this is, by circling the point. [1]
iii) Calculate the correlation coefficient for the relationship. [2]
iv) Describe the relationship between NSL level and IQ. Include any unusual features. [4]
v) The linear regression equation for estimating the NSL level from IQ of a student is
NSL = 0.101IQ – 3.557 (coefficients are rounded to 3 decimal places)
State a limitation of this model equation in predicting the NSL levels of students. [2]

vi) Interpret the gradient of the regression line equation. Include units. [3]
vii) Use the model equation, NSL = 0.101IQ – 3.557 to estimate the NSL level of a student with
an IQ of 115. [3]
viii) The mean of the variable IQ is 108.9. Use this result to find the mean of the variable NSL.
Explain your method. [3]

ix) If the two variables, NSL and IQ, were interchanged (swapped), explain in general the effect
on the equation of the regression line and the R-squared value for the new relationship. [3]
Regression Equation:
x) 1) State the value of the coefficient of determination by referring to the R output. [1]
2) Explain the meaning of this value in the context of these variables. [2]
xi) A pilot survey of the Year Five students was undertaken before the main sampling exercise. The
results are in the table for six individuals:

1) Use your calculator to find the correlation coefficient for the relationship. [1]

2) Assuming a linear model, with IQ as the explanatory variable, the equation for the least
squares regression line for the relationship is:
y = 0.1082x – 4.203 (coefficients rounded to 4SF)

Find the residual (prediction error) for the observed value (105, 8.4) [2]
3) Produce a scatterplot for the relationship. [2]
IQ    90    100   105   107   112   126
NSL   5.3   6     8.4   7.2   8     9.1
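
The pilot-survey calculations can be checked from first principles. This is a minimal sketch in Python using the six (IQ, NSL) pairs in the table above, with IQ as the explanatory variable. It computes the correlation coefficient, the least squares line and the residual at the observed point (105, 8.4) from the usual sums of squares; the variable names are illustrative only.

```python
import math

# Pilot survey data from the table above.
iq = [90, 100, 105, 107, 112, 126]
nsl = [5.3, 6, 8.4, 7.2, 8, 9.1]

n = len(iq)
mean_x = sum(iq) / n
mean_y = sum(nsl) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in iq)
syy = sum((y - mean_y) ** 2 for y in nsl)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(iq, nsl))

# Correlation coefficient and least squares line y = intercept + slope * x.
r = sxy / math.sqrt(sxx * syy)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual = observed value minus fitted value at the point (105, 8.4).
fitted_at_105 = intercept + slope * 105
residual = 8.4 - fitted_at_105

print(f"r = {r:.4f}")
print(f"slope = {slope:.4f}, intercept = {intercept:.3f}")
print(f"residual at (105, 8.4) = {residual:.3f}")
```

Working from the coefficients rounded to 4SF instead of the unrounded ones shifts the residual slightly in the third decimal place.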
